Today is the final day of the 30-day challenge. Thanks again to iT邦幫忙 for the chance to sharpen my coding skills. The programs in this series were fairly large, and many of them, especially the model-training parts, took my computer several days to run. That taught me what I now consider the essence of software development: efficiency. Let me present the finished system that combines an LSTM with YOLOv8 to analyze the behavior of multiple zebrafish. The code follows below.
import torch
import cv2
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, Dropout
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.callbacks import EarlyStopping
from sklearn.metrics import mean_squared_error
from ultralytics import YOLO
import os

# Initialize the YOLOv8 model (assumed to be fine-tuned to detect zebrafish)
yolo_model = YOLO("yolov8s.pt")  # runs on the GPU (CUDA) automatically when available
# LSTM模型的參數
lookback = 9 # 定義LSTM的時間步長
num_features = 4 # 每個時間步的特徵數量: x, y, w, h (斑馬魚的位置和大小)
lstm_units = 100 # 每層LSTM單元的數量
dropout_rate = 0.2 # dropout比例
# 初始化LSTM模型
lstm_model = Sequential()
# LSTM layers
lstm_model.add(LSTM(lstm_units, return_sequences=True, input_shape=(lookback, num_features)))
lstm_model.add(Dropout(dropout_rate))
lstm_model.add(LSTM(lstm_units, return_sequences=True))
lstm_model.add(Dropout(dropout_rate))
lstm_model.add(LSTM(lstm_units, return_sequences=False))
lstm_model.add(Dropout(dropout_rate))
# Fully connected layers
lstm_model.add(Dense(50, activation='relu'))
lstm_model.add(Dense(num_features, activation='linear'))
# Compile the model
lstm_model.compile(optimizer=Adam(learning_rate=0.001), loss='mse')
# Preprocessing: detect fish in every frame and build LSTM training samples
def preprocess_frames(video_path, lookback):
    cap = cv2.VideoCapture(video_path)
    frames = []
    sequences = []
    targets = []
    positions = []
    while cap.isOpened():
        ret, frame = cap.read()
        if not ret:
            break
        # YOLOv8 detection (assumes the model was trained to detect zebrafish)
        results = yolo_model.predict(frame, verbose=False)
        boxes = results[0].boxes
        if boxes is not None and len(boxes) > 0:
            # xywh gives centre x, centre y, width, height per detection.
            # Note: detections from all fish are pooled into one trajectory;
            # a per-fish tracker would be needed to keep identities separate.
            for x, y, w, h in boxes.xywh.cpu().numpy():
                positions.append([x, y, w, h])
        # Build LSTM samples: the last `lookback` positions predict the next one
        if len(positions) > lookback:
            sequences.append(positions[-lookback - 1:-1])
            targets.append(positions[-1])
            frames.append(frame)
    cap.release()
    return np.array(sequences), np.array(targets), frames
# Chronological split so training never sees future frames
def split_sequences(sequences, targets, split_ratio=0.8):
    split = int(len(sequences) * split_ratio)
    return (sequences[:split], sequences[split:],
            targets[:split], targets[split:])

# Load the video and generate sequences
video_path = "path_to_zebrafish_video.mp4"
sequences, targets, frames = preprocess_frames(video_path, lookback)

# Split into training and test sets
X_train, X_test, y_train, y_test = split_sequences(sequences, targets)
# Early-stopping policy
early_stopping = EarlyStopping(monitor='val_loss', patience=5, restore_best_weights=True)

# Train the LSTM to predict the next position from the past `lookback` positions
history = lstm_model.fit(X_train, y_train,
                         epochs=50,
                         batch_size=16,
                         validation_data=(X_test, y_test),
                         callbacks=[early_stopping])
# Evaluate the model
train_loss = lstm_model.evaluate(X_train, y_train, verbose=0)
test_loss = lstm_model.evaluate(X_test, y_test, verbose=0)
print(f"Training loss: {train_loss:.4f}, test loss: {test_loss:.4f}")
# Predict with the trained LSTM
predictions = lstm_model.predict(X_test)

# Prediction error against the true next positions
mse = mean_squared_error(y_test, predictions)
print(f"Prediction mean squared error (MSE): {mse:.4f}")
# Visualize the predicted zebrafish positions (one predicted box per test sample)
for frame, (x, y, w, h) in zip(frames[len(X_train):], predictions):
    # xywh is centre-based, so convert to corner coordinates for drawing
    x1, y1 = int(x - w / 2), int(y - h / 2)
    x2, y2 = int(x + w / 2), int(y + h / 2)
    cv2.rectangle(frame, (x1, y1), (x2, y2), (255, 0, 0), 2)
    cv2.imshow("Prediction", frame)
    if cv2.waitKey(30) & 0xFF == ord('q'):
        break
cv2.destroyAllWindows()
# Save the LSTM model
model_save_path = 'lstm_zebrafish_behavior.h5'
lstm_model.save(model_save_path)
print(f"LSTM model saved to: {model_save_path}")
Now let me walk through the code piece by piece.

import torch
import cv2
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, Dropout
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.callbacks import EarlyStopping
from sklearn.metrics import mean_squared_error
from ultralytics import YOLO
import os

- torch: the PyTorch backend that the YOLOv8 model runs on.
- cv2 (OpenCV): reads the video and handles image data.
- numpy: numerical operations and array handling.
- tensorflow.keras: builds and trains the LSTM model.
- ultralytics: provides the YOLOv8 model used to detect the zebrafish.
- os: operating-system utilities (e.g., file-path handling).

yolo_model = YOLO("yolov8s.pt")  # runs on the GPU (CUDA) automatically when available
YOLOv5("yolov5s.pt", device="cuda")
:加載YOLOv8模型,並指定在GPU(CUDA)上運行以加速運算。這裡使用的是一個預訓練的YOLOv8模型(yolov5s.pt
),假設它已經可以偵測斑馬魚。lookback = 9 # 定義LSTM的時間步長
lookback = 9        # number of past time steps the LSTM sees
num_features = 4    # features per time step: x, y, w, h (zebrafish position and size)
lstm_units = 100    # number of units in each LSTM layer
dropout_rate = 0.2  # dropout ratio
- lookback: the LSTM considers information from the previous 9 frames when making a prediction; this window length is a key parameter in time-series analysis. A small sketch after this list shows how the sliding window maps onto the LSTM's input shape.
- num_features: the number of features per time step, here 4 (x, y, w, h), representing a zebrafish's position and size.
- lstm_units: the number of neurons in each LSTM layer, set to 100 here.
- dropout_rate: the dropout ratio, used to guard against overfitting.
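To make the lookback window concrete, here is a minimal sketch (my addition, not part of the original code) of how a flat list of per-frame positions becomes the (samples, lookback, num_features) tensor the LSTM expects; the toy trajectory is invented purely for illustration:

import numpy as np

lookback = 9
# Hypothetical per-frame detections: one [x, y, w, h] row per frame
positions = np.random.rand(50, 4)

# Slide a window of `lookback` steps over the trajectory;
# each window predicts the position that immediately follows it
X = np.array([positions[i:i + lookback] for i in range(len(positions) - lookback)])
y = positions[lookback:]

print(X.shape)  # (41, 9, 4)  -> (samples, lookback, num_features)
print(y.shape)  # (41, 4)     -> the next position for each window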
lstm_model = Sequential()
# LSTM layers
lstm_model.add(LSTM(lstm_units, return_sequences=True, input_shape=(lookback, num_features)))
lstm_model.add(Dropout(dropout_rate))
lstm_model.add(LSTM(lstm_units, return_sequences=True))
lstm_model.add(Dropout(dropout_rate))
lstm_model.add(LSTM(lstm_units, return_sequences=False))
lstm_model.add(Dropout(dropout_rate))
# Fully connected layers
lstm_model.add(Dense(50, activation='relu'))
lstm_model.add(Dense(num_features, activation='linear'))
# Compile the model
lstm_model.compile(optimizer=Adam(learning_rate=0.001), loss='mse')
- Sequential(): builds the LSTM as a Keras sequential model.
- LSTM layers: three LSTM layers of 100 units each. The first two use return_sequences=True so that they pass the full sequence on to the next LSTM layer, letting it keep processing the time series; the third returns only the final output.
- Dropout layers: one after each LSTM layer, to prevent overfitting.
- Dense layers: two fully connected layers at the end, one with ReLU activation and one with linear activation that outputs the predicted coordinates and size.
- compile: compiles the model with the Adam optimizer and mean squared error (MSE) as the loss. The shape walkthrough right after this list traces a batch through these layers.
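As a sanity check, here is a short sketch (an addition of mine, assuming exactly the architecture above) that prints the layer-by-layer output shapes:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, Dropout

model = Sequential([
    LSTM(100, return_sequences=True, input_shape=(9, 4)),  # (None, 9, 100)
    Dropout(0.2),
    LSTM(100, return_sequences=True),                      # (None, 9, 100)
    Dropout(0.2),
    LSTM(100, return_sequences=False),                     # (None, 100)
    Dropout(0.2),
    Dense(50, activation='relu'),                          # (None, 50)
    Dense(4, activation='linear'),                         # (None, 4) -> x, y, w, h
])
model.summary()  # prints output shapes and parameter counts per layer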
def preprocess_frames(video_path, lookback):
    cap = cv2.VideoCapture(video_path)
    frames = []
    sequences = []
    targets = []
    positions = []
    while cap.isOpened():
        ret, frame = cap.read()
        if not ret:
            break
        # YOLOv8 detection (assumes the model was trained to detect zebrafish)
        results = yolo_model.predict(frame, verbose=False)
        boxes = results[0].boxes
        if boxes is not None and len(boxes) > 0:
            for x, y, w, h in boxes.xywh.cpu().numpy():
                positions.append([x, y, w, h])
        # Build LSTM samples: the last `lookback` positions predict the next one
        if len(positions) > lookback:
            sequences.append(positions[-lookback - 1:-1])
            targets.append(positions[-1])
            frames.append(frame)
    cap.release()
    return np.array(sequences), np.array(targets), frames
- cv2.VideoCapture(video_path): opens the video file so it can be read frame by frame.
- yolo_model.predict(frame): runs YOLOv8 on the frame and extracts each detected zebrafish's position and size (x, y, w, h) from results[0].boxes.
- positions.append([x, y, w, h]): accumulates each frame's fish positions and sizes into the positions list.
- if len(positions) > lookback: once enough positions have accumulated, the most recent lookback of them become one input sequence and the newest position becomes its prediction target. Note that these are raw pixel values; the sketch after this list shows an optional normalization step.
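The pipeline above feeds raw pixel coordinates to the LSTM. A common refinement, shown here as a hedged sketch rather than part of the original code, is to scale the features into [0, 1] using the frame size so the loss is not dominated by the larger-magnitude coordinates; the helper name and the 1280x720 example are mine:

import numpy as np

def normalize_positions(positions, frame_width, frame_height):
    """Scale [x, y, w, h] boxes from pixels into the [0, 1] range."""
    scale = np.array([frame_width, frame_height, frame_width, frame_height])
    return np.asarray(positions, dtype=np.float32) / scale

# Example with a hypothetical 1280x720 video:
boxes = [[640, 360, 50, 20], [700, 300, 48, 22]]
print(normalize_positions(boxes, 1280, 720))
# [[0.5 0.5 0.0390625 0.0277...], ...]

Predictions would then need to be multiplied back by the same scale before drawing boxes on the frames.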
def split_sequences(sequences, targets, split_ratio=0.8):
    split = int(len(sequences) * split_ratio)
    return (sequences[:split], sequences[split:],
            targets[:split], targets[split:])
- split_ratio: splits the data chronologically into training and test sets at the given ratio, so the model is never trained on frames that come after its test data.

video_path = "path_to_zebrafish_video.mp4"
sequences, targets, frames = preprocess_frames(video_path, lookback)

- video_path: the path to the video file.
- preprocess_frames: processes the video and produces the time-series data, the prediction targets, and the corresponding frames.

X_train, X_test, y_train, y_test = split_sequences(sequences, targets)
- split_sequences: divides the generated sequences and their targets into training and test sets.

early_stopping = EarlyStopping(monitor='val_loss', patience=5, restore_best_weights=True)
history = lstm_model.fit(X_train, y_train,
                         epochs=50,
                         batch_size=16,
                         validation_data=(X_test, y_test),
                         callbacks=[early_stopping])
- EarlyStopping: stops training automatically if the validation loss fails to improve for 5 consecutive epochs, restoring the best weights seen so far.
- lstm_model.fit: trains the LSTM for up to 50 epochs with a batch size of 16; the early-stopping policy guards against overfitting. The returned history object can also be plotted, as sketched below.
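For completeness, here is a small matplotlib sketch (my addition, not in the original post) that plots the loss curves recorded in the history object returned by fit:

import matplotlib.pyplot as plt

# `history` is the object returned by lstm_model.fit(...)
plt.plot(history.history['loss'], label='training loss')
plt.plot(history.history['val_loss'], label='validation loss')
plt.xlabel('epoch')
plt.ylabel('MSE loss')
plt.legend()
plt.title('LSTM training curve')
plt.show()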
train_loss = lstm_model.evaluate(X_train, y_train, verbose=0)
test_loss = lstm_model.evaluate(X_test, y_test, verbose=0)
print(f"Training loss: {train_loss:.4f}, test loss: {test_loss:.4f}")
predictions = lstm_model.predict(X_test)
mse = mean_squared_error(y_test, predictions)
print(f"Prediction mean squared error (MSE): {mse:.4f}")
- evaluate: computes the loss on the training and test sets and prints both values.
- predict: runs the trained model on the test sequences.
- mean_squared_error: measures the mean squared error (MSE) between the predicted and actual next positions. A per-feature breakdown, sketched below, can also show whether position or size is harder to predict.
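One optional diagnostic (my own suggestion, not in the original post) is to break the error down per feature, since a single MSE hides whether x, y, w, or h dominates the error:

import numpy as np

# y_test and predictions both have shape (samples, 4): columns are x, y, w, h
per_feature_mse = np.mean((y_test - predictions) ** 2, axis=0)
for name, err in zip(['x', 'y', 'w', 'h'], per_feature_mse):
    print(f"MSE[{name}] = {err:.4f}")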
for frame, (x, y, w, h) in zip(frames[len(X_train):], predictions):
    x1, y1 = int(x - w / 2), int(y - h / 2)
    x2, y2 = int(x + w / 2), int(y + h / 2)
    cv2.rectangle(frame, (x1, y1), (x2, y2), (255, 0, 0), 2)
    cv2.imshow("Prediction", frame)
    if cv2.waitKey(30) & 0xFF == ord('q'):
        break
cv2.destroyAllWindows()
- cv2.rectangle: draws a rectangle at the predicted position to mark the zebrafish; the centre-based xywh prediction is converted to corner coordinates first.
- cv2.imshow: displays the frame with the prediction overlaid.
- cv2.waitKey(30): shows each frame for 30 milliseconds and exits the loop if the 'q' key is pressed. To keep the result instead of only displaying it, the annotated frames can also be written to a video file, as sketched below.
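If you want to save the annotated output, a minimal sketch using OpenCV's VideoWriter (my addition; the output path and frame rate are assumptions) looks like this:

import cv2

fps = 30  # assumed frame rate of the source video
height, width = frames[0].shape[:2]
writer = cv2.VideoWriter('predictions.mp4',
                         cv2.VideoWriter_fourcc(*'mp4v'),
                         fps, (width, height))
for frame in frames[len(X_train):]:
    writer.write(frame)  # frames already carry the drawn prediction boxes
writer.release()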
model_save_path = 'lstm_zebrafish_behavior.h5'
lstm_model.save(model_save_path)
print(f"LSTM model saved to: {model_save_path}")
- lstm_model.save: saves the trained LSTM model to the given file for later use.

This code shows how to combine a YOLOv8 object-detection model with an LSTM time-series model to analyze the behavior of multiple zebrafish. It implements the complete pipeline from data preprocessing through model training to prediction and visualization, and saves the model so it can be reused in later analysis. A short loading example follows to close things out.
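To round things off, here is a brief sketch of reloading the saved model and predicting from a fresh sequence (the dummy input stands in for real detections and is illustrative only):

import numpy as np
from tensorflow.keras.models import load_model

model = load_model('lstm_zebrafish_behavior.h5')

# One dummy sequence of 9 time steps x 4 features, standing in for real detections
recent_track = np.random.rand(1, 9, 4).astype(np.float32)
next_box = model.predict(recent_track)
print(next_box)  # predicted [x, y, w, h] for the next frame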